MULTIPLE DRUG RESISTANCE GENE atrC OF ASPERGILLUS NIDULANS

Technical Field of the Invention 

This invention relates to recombinant DNA technology. 
In particular, the invention concerns the cloning of nucleic 
acid encoding a multiple drug resistance protein of 
Aspergillus nidulans.

Background of the Invention 



Multiple drug resistance (MDR) mediated by the human
mdr-1 gene product was initially recognized during the
course of developing regimens for cancer chemotherapy (Fojo
et al., 1987, Journal of Clinical Oncology 5:1922-1927). A
multiple drug resistant cancer cell line exhibits resistance
to high levels of a large variety of cytotoxic compounds.
Frequently these cytotoxic compounds have no common
structural features, nor do they interact with a common
target within the cell. Resistance to these cytotoxic
agents is mediated by an outward-directed, ATP-dependent
pump encoded by the mdr-1 gene. By this mechanism, toxic
levels of a particular cytotoxic compound are not allowed to
accumulate within the cell.

MDR-like genes have been identified in a number of
divergent organisms, including numerous bacterial species,
the fruit fly Drosophila melanogaster, Plasmodium
falciparum, the yeast Saccharomyces cerevisiae,
Caenorhabditis elegans, Leishmania donovani, marine
sponges, the plant Arabidopsis thaliana, as well as Homo
sapiens. Extensive searches have revealed several classes
of compounds that are able to reverse the MDR phenotype of
multiple drug resistant human cancer cell lines, rendering
them susceptible to the effects of cytotoxic compounds.
These compounds, referred to herein as "MDR inhibitors",
include, for example, calcium channel blockers,
antiarrhythmics, antihypertensives, antibiotics, antihistamines,
immunosuppressants, steroid hormones, modified steroids,
lipophilic cations, diterpenes, detergents, antidepressants,
and antipsychotics (Gottesman and Pastan, 1993, Annual
Review of Biochemistry 62:385-427). Clinical application
of human MDR inhibitors to cancer chemotherapy has become an
area of intensive focus for research.

On another front, the discovery and development of
antifungal compounds for specific fungal species has also
met with some degree of success. Candida species represent
the majority of fungal infections, and screens for new
antifungal compounds have been designed to discover
anti-Candida compounds. During development of antifungal agents,
activity has generally been optimized based on activity
against Candida albicans. As a consequence, these
anti-Candida compounds frequently do not possess clinically
significant activity against other fungal species such as
Aspergillus nidulans. However, it is interesting to note
that at higher concentrations some anti-Candida compounds
are able to kill other fungal species such as A. fumigatus
and A. nidulans. This type of observation suggests that the
antifungal target(s) of these anti-Candida compounds is
present in A. fumigatus and A. nidulans as well. Such
results indicate that A. nidulans may possess a natural
mechanism of resistance that permits it to survive in
clinically relevant concentrations of antifungal compounds.
Until the present invention, such a general mechanism of
resistance to antifungal compounds in A. nidulans has
remained undescribed.

Summary of the Invention

The invention provides, inter alia, isolated nucleic
acid molecules that comprise nucleic acid encoding a
multiple drug resistance protein from Aspergillus nidulans,
herein referred to as atrC, vectors encoding atrC, and host
cells transformed with these vectors.

In another embodiment, the invention provides a method
for determining the fungal MDR inhibition activity of a
compound which comprises:

a) placing a culture of fungal cells, transformed with
a vector capable of expressing atrC, in the presence of:

(i) an antifungal agent to which said fungal cell
is resistant, but to which said fungal cell is sensitive in
its untransformed state;

(ii) a compound suspected of possessing fungal MDR
inhibition activity; and

b) determining the fungal MDR inhibition activity of
said compound by measuring the ability of the antifungal
agent to inhibit the growth of said fungal cell.

In still another embodiment, the present invention
relates to strains of A. nidulans in which the atrC gene is
disrupted or otherwise mutated such that the atrC protein is
not produced in said strains.

In yet another embodiment, the present invention
relates to a method for identifying new antifungal
compounds.

Detailed Description of the Invention 

The present invention provides isolated nucleic acid
molecules that comprise a nucleic acid sequence encoding
atrC. The cDNA (complementary deoxyribonucleic acid)
sequence encoding atrC is provided in the Sequence Listing as
SEQ ID NO: 1. The amino acid sequence of the protein encoded
by atrC is provided in the Sequence Listing as SEQ ID NO: 2.
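
The correspondence between a cDNA sequence and the amino acid
sequence it encodes can be checked codon by codon. The following
Python sketch is illustrative only: it assumes the standard genetic
code, uses just the first eight codons printed for SEQ ID NO: 1 in
the Sequence Listing, and the names translate and CODON_TABLE are
hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate a coding sequence (5'->3') into one-letter amino acid codes."""
    seq = cdna.replace(" ", "").upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop translating at the first stop codon
            break
        protein.append(aa)
    return "".join(protein)


# First eight codons of SEQ ID NO: 1 as printed in the Sequence Listing.
first_codons = "ATG CGG AGG CTC GGA CCC TCA GTT"
print(translate(first_codons))  # Met Arg Arg Leu Gly Pro Ser Val
```

The output agrees with the first eight residues listed for SEQ ID
NO: 2 (Met Arg Arg Leu Gly Pro Ser Val).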

Those skilled in the art will recognize that the
degenerate nature of the genetic code enables one to
construct many different nucleic acid sequences that encode
the amino acid sequence of SEQ ID NO: 2. The cDNA sequence
depicted by SEQ ID NO: 1 is only one of many possible
atrC-encoding sequences. Consequently, the constructions
described below and in the accompanying examples for the
preferred nucleic acid molecules, vectors, and transformants
of the invention are illustrative and are not intended to
limit the scope of the invention.
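
The scale of this degeneracy can be estimated by multiplying the
number of synonymous codons available for each residue. The sketch
below is a minimal illustration under the assumption of the
standard genetic code; the function name count_encoding_sequences
is hypothetical, and only the first eight residues of SEQ ID NO: 2
are used.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "Gly": 4, "His": 2, "Ile": 3, "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2,
    "Pro": 4, "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}


def count_encoding_sequences(residues):
    """Return how many distinct coding sequences encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in residues)


# First eight residues of SEQ ID NO: 2 as printed in the Sequence Listing.
peptide = ["Met", "Arg", "Arg", "Leu", "Gly", "Pro", "Ser", "Val"]
print(count_encoding_sequences(peptide))  # 1*6*6*6*4*4*6*4 = 82944
```

Even this eight-residue stretch admits 82,944 distinct coding
sequences, so the number for the full-length protein is very large
but finite.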

All nucleotide and amino acid abbreviations used in
this disclosure are those accepted by the United States
Patent and Trademark Office as set forth in 37 C.F.R.
§1.822(b) (1994).

The term "vector" refers to any autonomously
replicating or integrating agent, including but not limited
to plasmids, cosmids, and viruses (including phage),
comprising a nucleic acid molecule to which one or more
additional nucleic acid molecules can be added. Included in
the definition of "vector" is the term "expression vector".
Vectors are used to amplify and/or express
deoxyribonucleic acid (DNA), either genomic or cDNA, or RNA
(ribonucleic acid) which encodes atrC, or to amplify DNA or
RNA that hybridizes with DNA or RNA encoding atrC.

The term "expression vector" refers to vectors which 
comprise a transcriptional promoter (hereinafter "promoter") 
and other regulatory sequences positioned to drive 
expression of a DNA segment that encodes atrC. Expression 
vectors of the present invention are replicable DNA 
constructs in which a DNA sequence encoding atrC is operably
linked to suitable control sequences capable of effecting
the expression of atrC in a suitable host. Such control 
sequences include a promoter, an optional operator sequence 
to control transcription, a sequence encoding suitable mRNA 
ribosomal binding sites, and sequences which control 
termination of transcription and translation. DNA regions 
are operably linked when they are functionally related to 
each other. For example, a promoter is operably linked to a 
DNA coding sequence if it controls the transcription of the 
sequence, or a ribosome binding site is operably linked to a 
coding sequence if it is positioned so as to permit 
translation. 

The term "MDR inhibition activity" refers to the 
ability of a compound to inhibit the MDR activity of a host 
cell, thereby increasing the antifungal activity of an 
antifungal compound against said host cell. 

In the present invention, atrC may be synthesized by
host cells transformed with vectors that provide for the
expression of DNA encoding atrC. The DNA encoding atrC may
be the natural sequence or a synthetic sequence or a
combination of both ("semi-synthetic sequence"). The in
vitro or in vivo transcription and translation of these
sequences results in the production of atrC. Synthetic and
semi-synthetic sequences encoding atrC may be constructed by
techniques well known in the art. See Brown et al. (1979)
Methods in Enzymology, Academic Press, N.Y., 68:109-151.
atrC-encoding DNA, or portions thereof, may be generated
using a conventional DNA synthesizing apparatus such as the
Applied Biosystems Model 380A, 380B, 394 or 3948 DNA
synthesizers (commercially available from Applied
Biosystems, Inc., 850 Lincoln Center Drive, Foster City, CA
94404).

Owing to the natural degeneracy of the genetic code,
the skilled artisan will recognize that a sizable yet
definite number of nucleic acid sequences may be constructed
which encode atrC. All such nucleic acid sequences are
provided by the present invention. These sequences can be
prepared by a variety of methods and, therefore, the
invention is not limited to any particular preparation
means. The nucleic acid sequences of the invention can be
produced by a number of procedures, including DNA synthesis,
cDNA cloning, genomic cloning, polymerase chain reaction
(PCR) technology, or a combination of these approaches.
These and other techniques are described by Maniatis et
al., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Press, Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York (1989), or Current Protocols in Molecular
Biology (F. M. Ausubel et al., 1989 and supplements). The
contents of both of these references are incorporated herein
by reference.

In another aspect, this invention provides the genomic
DNA encoding atrC, which may be obtained by synthesizing the
desired portion of SEQ ID NO: 1 or by following the
procedure carried out by Applicants. This procedure involved
construction of a cosmid genomic DNA library from
Aspergillus nidulans strain OC-1, a mutant derived from
A42355. This library was screened for genes related to MDRs
using a homologous probe generated by PCR. Degenerate PCR
primers directed towards amplification of DNA sequences
encoding highly conserved regions found in the ATP-binding
domain of several MDR genes were synthesized. PCR using
these primers and Aspergillus nidulans genomic DNA as
template produced an approximately 400 base pair DNA
fragment. The DNA sequence of this fragment was highly
homologous to the ATP-binding region of several MDRs, as
predicted. This fragment was used as a hybridization probe
to identify cosmid clones containing the entire atrC gene. A
subclone from one such cosmid containing the entire atrC
gene was sequenced to ascertain the entire sequence of atrC.
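
A degenerate primer is a mixture of related oligonucleotides, and
its complexity can be enumerated from the IUPAC ambiguity codes it
contains. The sketch below is illustrative only: the primer
fragment GGNAARAC is a hypothetical example and is not one of the
primers used by Applicants, whose sequences are not given here; the
function names are likewise assumptions.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand_degenerate(primer: str) -> list:
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer fragment: GGN (Gly) followed by AAR (Lys) and AC.
example_primer = "GGNAARAC"
print(degeneracy(example_primer))         # 4 choices for N times 2 for R
print(expand_degenerate(example_primer))  # the eight concrete sequences
```

Keeping the degeneracy low, for example by choosing conserved
regions rich in residues with few synonymous codons, limits the
number of oligonucleotide species in the primer pool.
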

To effect the translation of atrC-encoding mRNA, one
inserts the natural, synthetic, or semi-synthetic
atrC-encoding DNA sequence into any of a large number of
appropriate expression vectors through the use of
appropriate restriction endonucleases and DNA ligases.
Synthetic and semi-synthetic atrC-encoding DNA sequences can
be designed, and natural atrC-encoding nucleic acid can be
modified, to possess restriction endonuclease cleavage sites
to facilitate isolation from and integration into these
vectors. Particular restriction endonucleases employed will
be dictated by the restriction endonuclease cleavage pattern
of the expression vector utilized. Restriction enzyme sites
are chosen so as to properly orient the atrC-encoding DNA
with the control sequences to achieve proper in-frame
transcription and translation of the atrC molecule. The
atrC-encoding DNA must be positioned so as to be in proper
reading frame with the promoter and ribosome binding site of
the expression vector, both of which are functional in the
host cell in which atrC is to be expressed.

Expression of atrC in fungal cells, such as
Saccharomyces cerevisiae, is preferred. Suitable promoter
sequences for use with yeast hosts include the promoters for
3-phosphoglycerate kinase (found on plasmid pAP12BD (ATCC
53231) and described in U.S. Patent No. 4,935,350, June 19,
1990) or other glycolytic enzymes such as enolase (found on
plasmid pAC1 (ATCC 39532)), glyceraldehyde-3-phosphate
dehydrogenase (derived from plasmid pHcGAPC1 (ATCC 57090,
57091)), hexokinase, pyruvate decarboxylase,
phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate
isomerase, phosphoglucose isomerase, and glucokinase.
Inducible yeast promoters have the additional advantage of
transcription controlled by growth conditions. Such
promoters include the promoter regions for alcohol
dehydrogenase 2, isocytochrome C, acid phosphatase,
degradative enzymes associated with nitrogen metabolism,
metallothionein (contained on plasmid vector pCL28XhoLHBPV
(ATCC 39475), United States Patent No. 4,840,896),
glyceraldehyde 3-phosphate dehydrogenase, and enzymes
responsible for maltose and galactose utilization (GAL1
found on plasmid pRY121 (ATCC 37658) and on plasmid pPST5,
described below). Suitable vectors and promoters for use in
yeast expression are further described by R. Hitzeman et
al., in European Patent Publication No. 73,657A. Yeast
enhancers such as the UAS Gal enhancer from Saccharomyces
cerevisiae (found in conjunction with the CYC1 promoter on
plasmid YEpsec--hI1beta, ATCC 67024), also are
advantageously used with yeast promoters.

A variety of expression vectors useful in the present
invention are well known in the art. For expression in
Saccharomyces, the plasmid YRp7, for example, (ATCC-40053,
Stinchcomb et al., 1979, Nature 282:39; Kingsman et al.,
1979, Gene 7:141; Tschemper et al., 1980, Gene 10:157) is
commonly used. This plasmid contains the trp gene which
provides a selection marker for a mutant strain of yeast
lacking the ability to grow in the absence of tryptophan,
for example ATCC 44076 or PEP4-1 (Jones, 1977, Genetics 85:12).

Expression vectors useful in the expression of atrC can
be constructed by a number of methods. For example, the
cDNA sequence encoding atrC can be synthesized using DNA
synthesis techniques such as those described above. Such
synthetic DNA can be synthesized to contain cohesive ends
that allow facile cloning into an appropriately digested
expression vector. For example, the cDNA encoding atrC can
be synthesized to contain NotI cohesive ends. Such a
synthetic DNA fragment can be ligated into a NotI-digested
expression vector such as pYES-2 (Invitrogen Corp., San
Diego CA 92121).

An expression vector can also be constructed in the
following manner. Logarithmic phase Aspergillus nidulans
cells are disrupted by grinding under liquid nitrogen
according to the procedure of Minuth et al., 1982 (Current
Genetics 5:227-231). Aspergillus nidulans mRNA is
preferably isolated from the disrupted cells using the
QuickPrep® mRNA Purification Kit (Pharmacia Biotech)
according to the instructions of the manufacturer. cDNA is
produced from the isolated mRNA using the TimeSaver® cDNA
Synthesis Kit (Pharmacia Biotech) using oligo(dT) according
to the procedure described by the manufacturer. In this
process an EcoRI/NotI adapter (Stratagene, Inc.) is ligated
to each end of the double stranded cDNA. The adapter
modified cDNA is ligated into the vector Lambda Zap®II
using the Predigested Lambda Zap®II/EcoRI/CIAP Cloning Kit
(Stratagene, Inc.) according to the instructions of the
manufacturer to create a cDNA library.

The library is screened for full-length cDNA encoding
atrC using a 32P-radiolabeled fragment of the atrC gene. In
this manner, a full-length cDNA clone is recovered from the
Aspergillus nidulans cDNA library. A full-length cDNA clone
recovered from the library is removed from the Lambda
Zap®II vector by digestion with the restriction
endonuclease NotI which produces a DNA fragment encoding
atrC. This plasmid further comprises the ColE1 origin of
replication which allows replication in E. coli, and the
ampicillin resistance gene for selection of E. coli
transformants. The expression plasmid further comprises the
yeast 2μ origin of replication (2μ ori), allowing
replication in yeast host cells, the yeast URA3 gene for
selection of S. cerevisiae cells transformed with the
plasmid grown in a medium lacking uracil, and the origin of
replication from the f1 filamentous phage.

In a preferred embodiment of the invention,
Saccharomyces cerevisiae INVSc1 or INVSc2 cells (Invitrogen
Corp., Sorrento Valley Blvd., San Diego CA 92121) are
employed as host cells, but numerous other cell lines are
available for this use. The transformed host cells are
plated on an appropriate medium under selective pressure
(minimal medium lacking uracil). The cultures are then
incubated for a time and temperature appropriate to the host
cell line employed.

The techniques involved in the transformation of yeast
cells such as Saccharomyces cerevisiae cells are well known
in the art and may be found in such general references as
Ausubel et al., Current Protocols in Molecular Biology
(1989), John Wiley & Sons, New York, NY and supplements.
The precise conditions under which the transformed yeast
cells are cultured are dependent upon the nature of the yeast
host cell line and the vectors employed.

Nucleic acid, either RNA or DNA, which encodes atrC, or
a portion thereof, is also useful in producing nucleic acid
molecules useful in diagnostic assays for the detection of
atrC mRNA, atrC cDNA, or atrC genomic DNA. Further, nucleic
acid, either RNA or DNA, which does not encode atrC, but
which nonetheless is capable of hybridizing with
atrC-encoding DNA or RNA is also useful in such diagnostic
assays. These nucleic acid molecules may be covalently
labeled by known methods with a detectable moiety such as a
fluorescent group, a radioactive atom or a chemiluminescent
group. The labeled nucleic acid is then used in
conventional hybridization assays, such as Southern or
Northern hybridization assays, or polymerase chain reaction
(PCR) assays, to identify hybridizing DNA, cDNA, or RNA
molecules. PCR assays may also be performed using unlabeled
nucleic acid molecules. Such assays may be employed to
identify atrC vectors and transformants and in in vitro
diagnosis to detect atrC-like mRNA, cDNA, or genomic DNA
from other organisms.

United States Patent Application Serial No. 08/111680,
the entire contents of which are hereby incorporated herein
by reference, describes the use of combination therapy
involving an antifungal agent possessing a proven spectrum
of activity, with a fungal MDR inhibitor to treat fungal
infections. This combination therapy approach enables an
extension of the spectrum of antifungal activity for a given
antifungal compound which previously had only demonstrated
limited clinically relevant antifungal activity. Similarly,
compounds with demonstrated antifungal activity can also be
potentiated by a fungal MDR inhibitor such that the
antifungal activity of these compounds is extended to
previously resistant species. To identify compounds useful
in such combination therapy the present invention provides
an assay method for identifying compounds with Aspergillus
nidulans MDR inhibition activity. Host cells that express
atrC provide an excellent means for the identification of
compounds useful as inhibitors of Aspergillus nidulans MDR
activity. Generally, the assay utilizes a culture of a
yeast cell transformed with a vector which provides
expression of atrC. The expression of atrC by the host cell
enables the host cell to grow in the presence of an
antifungal compound to which the yeast cell is sensitive
in the untransformed state. Thus, the transformed yeast
cell culture is grown in the presence of i) an antifungal
agent to which the untransformed yeast cell is sensitive,
but to which the transformed host cell is resistant, and
ii) a compound that is suspected of being an MDR inhibitor.
The effect of the suspected MDR inhibitor is measured by
testing for the ability of the antifungal compound to
inhibit the growth of the transformed yeast cell. Such
inhibition will occur if the suspected Aspergillus nidulans
MDR inhibitor blocks the ability of atrC to prevent the
antifungal compound from acting on the yeast cell. An
illustrative example of such an assay is provided in Example
3.

In order to illustrate more fully the operation of this 
invention, the following examples are provided, but are not 
to be construed as a limitation on the scope of the 
invention.

Example 1

Source of the atrC-Encoding Genomic DNA and cDNA of
Aspergillus nidulans

Complementary DNA encoding atrC (sequence presented in
SEQ ID NO: 1) may be from a natural sequence, a synthetic
source or a combination of both ("semi-synthetic sequence").
The in vitro or in vivo transcription and translation of
these sequences results in the production of atrC.
Synthetic and semi-synthetic sequences encoding atrC may be
constructed by techniques well known in the art. See Brown
et al. (1979) Methods in Enzymology, Academic Press, N.Y.,
68:109-151. atrC-encoding DNA, or portions thereof, may be
generated using a conventional DNA synthesizing apparatus
such as the Applied Biosystems Model 380A, 380B, 394 or 3948
DNA synthesizers (commercially available from Applied
Biosystems, Inc., 850 Lincoln Center Drive, Foster City, CA
94404). The polymerase chain reaction is especially useful
in generating these DNA sequences. PCR primers are
constructed which include the translational start (ATG) and
translational stop codon (TAG) of atrC. Restriction enzyme
sites may be included on these PCR primers outside of the
atrC coding region to facilitate rapid cloning into
expression vectors. Aspergillus nidulans genomic DNA is
used as the PCR template for synthesis of atrC including
introns, which is useful for expression studies in closely
related fungi. In contrast, cDNA is used as the PCR
template for synthesis of atrC devoid of introns, which is
useful for expression in foreign hosts such as Saccharomyces
cerevisiae or bacterial hosts such as Escherichia coli.

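
The primer construction described above can be sketched in a few
lines of Python. This is a minimal illustration under stated
assumptions: the open reading frame toy_orf is a short hypothetical
sequence rather than atrC, the annealing length of 18 bases is an
arbitrary choice, and the NotI recognition sequence GCGGCCGC is
used only because NotI ends are mentioned elsewhere in this
disclosure. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(orf: str, site: str, anneal_len: int = 18):
    """Build forward and reverse primers that span the start and stop codons
    and carry a restriction site outside the coding region."""
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("coding region must begin with ATG")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("coding region must end with a stop codon")
    forward = site + orf[:anneal_len]                       # includes ATG
    reverse = site + reverse_complement(orf[-anneal_len:])  # includes stop codon
    return forward, reverse


NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence
toy_orf = "ATGGCTGCAAAGCTTGGTACCGATCTGGAATTCCGTTAG"  # hypothetical ORF, not atrC

forward_primer, reverse_primer = design_primers(toy_orf, NOTI_SITE)
print(forward_primer)
print(reverse_primer)
```

In practice a few additional bases are commonly placed 5' of the
restriction site to aid digestion near the end of the PCR product;
they are omitted here for brevity.
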
Example 2

Expression of the atrC Protein

Saccharomyces cerevisiae INVSc1 cells (Invitrogen
Corp., San Diego CA 92121) are transformed with the plasmid
containing atrC by the technique described by J. D. Beggs
(1978, Nature 275:104-109). The transformed yeast cells are
grown in a broth medium containing YNB/CSM-Ura/raf (YNB/CSM-Ura
[Yeast Nitrogen Base (Difco Laboratories, Detroit, MI)
supplemented with CSM-URA (Bio 101, Inc.)] supplemented with
4% raffinose) at 28°C in a shaker incubator until the
culture is saturated. To induce expression of atrC, a
portion of the culture is used to inoculate a flask
containing YNB/CSM-Ura medium supplemented with 2% galactose
(YNB/CSM-Ura/gal) rather than raffinose as the sole carbon
source. The inoculated flask is incubated at 28°C for about
16 hours.

Example 3 

Antifungal Potentiator Assay

Approximately 1 x 10^ cells of a Saccharomyces
cerevisiae INVSc1 culture expressing atrC are delivered to
each of several agar plates containing YNB/CSM-Ura/gal. The
agar surface is allowed to dry in a biohazard hood.

An antifungal compound that the untransformed yeast
cell is typically sensitive to is dissolved in an
appropriate solvent at a concentration that is biologically
effective. Twenty μl of the solution is delivered to an
antibiotic susceptibility test disc (Difco Laboratories,
Detroit, MI). After addition of the antifungal solution the
disc is allowed to air dry in a biohazard hood. When dry,
the disc is placed on the surface of the petri plates
containing the transformed Saccharomyces cerevisiae INVSc1
cells.

Compounds to be tested for the ability to inhibit atrC
are dissolved in dimethylsulfoxide (DMSO). The amount of
compound added to the DMSO depends on the solubility of the
individual compound to be tested. Twenty μl of the
suspensions containing a compound to be tested are delivered
to an antibiotic susceptibility test disc (Difco
Laboratories, Detroit, MI). The disc is then placed on the
surface of the dried petri plates containing the transformed
Saccharomyces cerevisiae INVSc1 cells approximately 2 cm
from the antifungal-containing disc. Petri plates
containing the two discs are incubated at 28°C for about
16-48 hours.

Following this incubation period, the petri plates are
examined for zones of growth inhibition around the discs. A
zone of growth inhibition near the antifungal disc on the
test plate indicates that the compound being tested for MDR
inhibition activity blocks the activity of atrC and allows
the antifungal compound to inhibit the growth of the yeast
host cell. Such compounds are said to possess MDR
inhibition activity. Little or no zone of growth inhibition
indicates that the test compound does not block MDR activity
and, thus, atrC is allowed to act upon the antifungal
compound to prevent its activity upon the host cell.
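
The reading of the plates can be expressed as a simple decision
rule. The sketch below is illustrative only: no numerical cut-off
is specified in this example, so the 2.0 mm threshold, the
PlateReading record, and the sample measurements are all
hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical cut-off in millimetres; the example gives no numerical threshold.
MIN_ZONE_MM = 2.0


@dataclass
class PlateReading:
    test_compound: str
    zone_mm: float  # width of the growth-inhibition zone near the antifungal disc


def has_mdr_inhibition_activity(reading: PlateReading,
                                min_zone_mm: float = MIN_ZONE_MM) -> bool:
    """A clear zone near the antifungal disc suggests the test compound blocks atrC."""
    if reading.zone_mm < 0:
        raise ValueError("zone width cannot be negative")
    return reading.zone_mm >= min_zone_mm


# Hypothetical measurements for two test compounds.
readings = [PlateReading("compound A", 6.5), PlateReading("compound B", 0.0)]
for reading in readings:
    verdict = "possesses" if has_mdr_inhibition_activity(reading) else "lacks"
    print(f"{reading.test_compound}: {verdict} MDR inhibition activity")
```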

Example 4

Screen For Novel Antifungal Compounds 

A plasmid molecule is constructed which contains DNA
sequence information required for replication and genetic
transformation in E. coli (e.g. ampicillin resistance). The
plasmid also comprises DNA sequences encoding a marker for
selection in fungal cells (e.g. hygromycin B
phosphotransferase, phleomycin resistance, G418 resistance)
under the control of an A. nidulans promoter. Additionally,
the plasmid contains an internal portion of the atrC gene
(e.g. about 3000 base pairs which lack 500 base pairs at the
N-terminal end, and about 500 base pairs at the C-terminal
end of the coding region specified by SEQ ID NO: 1). The atrC
gene fragment enables a single crossover gene disruption
when transformed or otherwise introduced into A. nidulans.

Alternatively, a 5 kilobase pair to 6 kilobase pair
region of A. nidulans genomic DNA containing the atrC gene
is subcloned into the aforementioned plasmid. Then, a
central portion of the atrC gene is removed and replaced
with a selectable marker, such as hygromycin B
phosphotransferase, for a double crossover gene replacement.

Gene disruption and gene replacement procedures for A.
nidulans are well known in the art (see e.g. May et al., J.
Cell Biol. 101, 712, 1985; Jones and Sealy-Lewis, Curr.
Genet. 17, 81, 1990). Transformants are recovered on an
appropriate selection medium, for example, hygromycin (if
the hygromycin B gene is used in the construction of the
disruption cassette). Gene replacement, or gene disruption, is
verified by any suitable method, for example, by Southern blot
hybridization.

Gene disruption or gene replacement strains are
rendered hypersensitive to antifungal compounds, and are
useful in screens for new antifungal compounds in whole cell
growth inhibition studies.

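
A whole cell growth inhibition screen of this kind reduces to
comparing growth of the disrupted strain with and without the test
compound. The following sketch is a minimal illustration; the
growth values (for example optical density readings), the 50%
cut-off, and the function names are hypothetical assumptions and
are not specified in this example.

```python
def percent_growth_inhibition(growth_with: float, growth_without: float) -> float:
    """Percent reduction in growth caused by the test compound."""
    if growth_without <= 0:
        raise ValueError("growth without compound must be positive")
    return 100.0 * (1.0 - growth_with / growth_without)


def is_candidate_antifungal(growth_with: float, growth_without: float,
                            min_inhibition: float = 50.0) -> bool:
    """Flag a compound when inhibition meets a hypothetical cut-off."""
    return percent_growth_inhibition(growth_with, growth_without) >= min_inhibition


# Hypothetical growth readings for the atrC-disrupted strain.
print(percent_growth_inhibition(0.2, 0.8))  # strong inhibition
print(is_candidate_antifungal(0.7, 0.8))    # weak inhibition, not flagged
```
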
CLAIMS

We claim: 

1. A DNA compound that comprises an isolated DNA sequence 
encoding SEQ ID NO: 2. 

2. The DNA compound of Claim 1 which comprises the isolated
DNA sequence which is SEQ ID NO: 1.

3. A vector comprising an isolated DNA sequence of Claim 1.

4. A vector comprising an isolated DNA sequence of Claim 2. 

5. A method for constructing a transformed host cell 
capable of expressing SEQ ID NO: 2, said method comprising 
transforming a host cell with a recombinant DNA vector that 
comprises an isolated DNA sequence of Claim 1. 

6. A method for expressing SEQ ID NO: 2 in a transformed 
host cell said method comprising culturing said transformed 
host cell of Claim 5 under conditions suitable for gene 
expression. 

7. An isolated DNA molecule of Claim 1 or a portion
thereof, which is labeled with a detectable moiety.

8. A host cell containing the vector of Claim 3.

9. A host cell containing the vector of Claim 4. 

10. A method for determining the fungal MDR inhibition
activity of a compound which comprises:

a) placing a culture of fungal cells, transformed with
a vector capable of expressing atrC, in the presence of:

(i) an antifungal agent to which said fungal cell
is resistant, but to which said fungal cell is sensitive in
its untransformed state;

(ii) a compound suspected of possessing
Aspergillus nidulans MDR inhibition activity; and

b) determining the fungal MDR inhibition activity of
said compound by measuring the ability of the antifungal
agent to inhibit the growth of said fungal cell.

11. A method of Claim 10 wherein the fungal cell is
Saccharomyces cerevisiae.

12. The protein of SEQ ID NO: 2 in purified form.

13. A strain of A. nidulans wherein said strain carries a
gene disruption or gene replacement at the atrC locus such
that said strain does not produce the atrC protein product.

14. A method for identifying an antifungal compound
comprising the steps of:

a. culturing in the presence of a test compound a
strain of claim 13;

b. culturing said strain in the absence of said test
compound; and

c. comparing the growth of said strain in step (a)
with the growth in step (b).

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Eli Lilly and Company 

(ii) TITLE OF INVENTION: Multiple Drug Resistance Gene atrC of
Aspergillus Nidulans

(iii) NUMBER OF SEQUENCES: 3 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Eli Lilly and Company 

(B) STREET: Lilly Corporate Center 

(C) CITY: Indianapolis 

(D) STATE: Indiana 

(E) COUNTRY: U.S. 

(F) ZIP: 46285

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Webster, Thomas D. 

(B) REGISTRATION NUMBER: 39,872 

(C) REFERENCE/DOCKET NUMBER: X-11765

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 317-276-3334 

(B) TELEFAX: 317-276-2763 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3927 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3924 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG CGG AGG CTC GGA CCC TCA GTT TAC CGG CGT TCG GAC GTG TCT ACT
48 

Met Arg Arg Leu Gly Pro Ser Val Tyr Arg Arg Ser Asp Val Ser Thr 

1 5 10 15 

TTA AAA AAA AAG AAG CTC TCG TTG TCA CCA TCG TCA TGC TCG ACC GCG 
96 

Leu Lys Lys Lys Lys Leu Ser Leu Ser Pro Ser Ser Cys Ser Thr Ala 
20 25 30 

GCT GTA CCA GAC TCC GTC TCA GGA CGA GTC GAC CAC CAG TGT ACC ATG 
144 

Ala Val Pro Asp Ser Val Ser Gly Arg Val Asp His Gln Cys Thr Met
35 40 45 

CAC GGA GGC GCC TCT GGT CGA GGA AGG GGA GGA AGC AAG CTT TGG CGC 
192 

His Gly Gly Ala Ser Gly Arg Gly Arg Gly Gly Ser Lys Leu Trp Arg 
50 55 60 

ATA CAA GGT GCC AAG CTG ATA TGC TCG CGC AAA AGA GGA TCT TTA CAT 
240 

Ile Gln Gly Ala Lys Leu Ile Cys Ser Arg Lys Arg Gly Ser Leu His 
65 70 75 80 

TCG CCG GCA GGA CAG AAC TTA TCC TTC AGG CCG TTG CTA TCC TTG CTG 
288 

Ser Pro Ala Gly Gln Asn Leu Ser Phe Arg Pro Leu Leu Ser Leu Leu 
85 90 95 

CAT GCG CCT CTG GAG CAG GAA TTG CGC TTC AAA ACC TCA TCT TCG GCC 
336 

His Ala Pro Leu Glu Gln Glu Leu Arg Phe Lys Thr Ser Ser Ser Ala 
100 105 110 

AGT TCG TCA CCG TCA TCA CCG ATT TCA CCA ACG GAA TCT CAA CGC CGG 
384 

Ser Ser Ser Pro Ser Ser Pro Ile Ser Pro Thr Glu Ser Gln Arg Arg 
115 120 125 

CAG ACT TTC GTG ACA ATG CCG CCG AGT TGG CGT ATC CTC TAC TTT GTA 
432 

Gln Thr Phe Val Thr Met Pro Pro Ser Trp Arg Ile Leu Tyr Phe Val 
130 135 140 

TAC CTG GGC ATC GCG CGG CTC GTC CTC TCC TAC ACC TAC AAC ACC CTC 
480 

Tyr Leu Gly Ile Ala Arg Leu Val Leu Ser Tyr Thr Tyr Asn Thr Leu 
145 150 155 160 

CTA ACC TAC GCG GCC TAC CGC ATC GTC CGC AAT ATC CGA CAC GCC TAT 
528 

Leu Thr Tyr Ala Ala Tyr Arg Ile Val Arg Asn Ile Arg His Ala Tyr 
165 170 175 

CTC AAA GCG GCG CTG AGC CAA GAA GTG GCA TAC TAC GAT TTC GGT AGC 
576 

Leu Lys Ala Ala Leu Ser Gln Glu Val Ala Tyr Tyr Asp Phe Gly Ser 
180 185 190 

GGG GGC TCC ATC GCC GCG CAG GCA ACT TCG AAC GGC AAA CTG ATC CAG 
624 

Gly Gly Ser Ile Ala Ala Gln Ala Thr Ser Asn Gly Lys Leu Ile Gln 
195 200 205 

GCC GGC GCC TCG GAT AAG ATC GGT CTT CTC TTC CAG GGC CTC GCA GCA 
672 

Ala Gly Ala Ser Asp Lys Ile Gly Leu Leu Phe Gln Gly Leu Ala Ala 
210 215 220 

TTC GTG ACG CTT TCA TTA TCG CGT TTG TGG TGC AAG TGG AAA CTC ACT 
720 

Phe Val Thr Leu Ser Leu Ser Arg Leu Trp Cys Lys Trp Lys Leu Thr 
225 230 235 240 

CTG ATC TGC ATC TGC ATC CCC GTA GCC ACG ATC GGC ACG ACG GGG GTG 
768 

Leu Ile Cys Ile Cys Ile Pro Val Ala Thr Ile Gly Thr Thr Gly Val 
245 250 255 

GTA GCT GCG GTC GAG GCT GGG CAC GAG ACG AGG ATC TTG CAG ATA CAT 
816 

Val Ala Ala Val Glu Ala Gly His Glu Thr Arg Ile Leu Gln Ile His 
260 265 270 

GCG CAG GCG AAT TCG TTT GCC GAG GGT ATT CTG GCG GGT GTG AAG GCT 
864 

Ala Gln Ala Asn Ser Phe Ala Glu Gly Ile Leu Ala Gly Val Lys Ala 
275 280 285 

GTT CAT GCT TTT GGG ATG CGG GAT AGT CTG GTC AGG AAG TTT GAT GAA 
912 

Val His Ala Phe Gly Met Arg Asp Ser Leu Val Arg Lys Phe Asp Glu 
290 295 300 

TAT CTG GTG GAG GCG CAT AAG GTC GGT AAG AAG ATC TCG CCG CTG CTT 
960 

Tyr Leu Val Glu Ala His Lys Val Gly Lys Lys Ile Ser Pro Leu Leu 
305 310 315 320 

GGT CTT CTC TTC TCG GCG GAG TAT ACG ATC ATC TAC CTT GGA TAT GGG 
1008 

Gly Leu Leu Phe Ser Ala Glu Tyr Thr Ile Ile Tyr Leu Gly Tyr Gly 
325 330 335 

CTG GCG TTT TGG CAG GGG ATC CAT ATG TTC GGC AGG GGG GAG ATT GGG 
1056 

Leu Ala Phe Trp Gln Gly Ile His Met Phe Gly Arg Gly Glu Ile Gly 
340 345 350 

ACT GCT GGG GAT ATC TTT ACG GTT TTG CTC TCT GTC GTC ATT GCG TCA 
1104 

Thr Ala Gly Asp Ile Phe Thr Val Leu Leu Ser Val Val Ile Ala Ser 
355 360 365 

ATC AAC CTG ACT TTA CTG GCG CCG TAT TCA ATT GAA TTT AGC AGG GCT 
1152 

Ile Asn Leu Thr Leu Leu Ala Pro Tyr Ser Ile Glu Phe Ser Arg Ala 
370 375 380 

GCT TCA GCG GCT GCG CAA CTG TTC CGA CTC ATA GAT CGA GAG TCT GAA 
1200 

Ala Ser Ala Ala Ala Gln Leu Phe Arg Leu Ile Asp Arg Glu Ser Glu 
385 390 395 400 

ATC AAC CCA TAC GGG AAG GAA GGC CTC GAG CCG GAA CGG GTA TTA GGC 
1248 

Ile Asn Pro Tyr Gly Lys Glu Gly Leu Glu Pro Glu Arg Val Leu Gly 
405 410 415 

GAC GTC GAG CTC GAG AAT GTT ACG TTC TCG TAT CCC ACG AGG CCG GGG 
1296 

Asp Val Glu Leu Glu Asn Val Thr Phe Ser Tyr Pro Thr Arg Pro Gly 
420 425 430 

ATT ACC GTC CTC GAT AAC TTC AGT CTC AAG GTC CCA GCG GGA AAG GTG 
1344 

Ile Thr Val Leu Asp Asn Phe Ser Leu Lys Val Pro Ala Gly Lys Val 
435 440 445 

ACT GCC CTG GTA GGG CAA TCT GGA TCG GGG AAG AGC ACG ATC GTG GGA 
1392 

Thr Ala Leu Val Gly Gln Ser Gly Ser Gly Lys Ser Thr Ile Val Gly 
450 455 460 

TTG CTC GAG CGG TGG TAT AAC CCG ACC TCT GGG GCG ATC AGA CTC GAC 
1440 

Leu Leu Glu Arg Trp Tyr Asn Pro Thr Ser Gly Ala Ile Arg Leu Asp 
465 470 475 480 

GGG AAC CTG ATC AGT GAG CTC AAT GTT GGC TGG CTG CGG AGG AAT GTG 
1488 

Gly Asn Leu Ile Ser Glu Leu Asn Val Gly Trp Leu Arg Arg Asn Val 
485 490 495 

CGG CTC GTA CAG CAG GAG CCG GTG CTC TTC CAG GGA AGC GTG TTC GAT 
1536 

Arg Leu Val Gln Gln Glu Pro Val Leu Phe Gln Gly Ser Val Phe Asp 
500 505 510 

AAC ATC AGG TAC GGC CTC GTC GGG ACG CCG TGG GAG AAT GCC TCT CGG 
1584 

Asn Ile Arg Tyr Gly Leu Val Gly Thr Pro Trp Glu Asn Ala Ser Arg 
515 520 525 

GAA GAG CAG ATG GAA CGG GTG CAG GAG GCC GCG AAG TTG GCA TAT GCG 
1632 

Glu Glu Gln Met Glu Arg Val Gln Glu Ala Ala Lys Leu Ala Tyr Ala 
530 535 540 

CAC GAA TTC ATC TCT GAG CTG ACC GAC GGA TAC GAT ACG CTG ATC GGC 
1680 

His Glu Phe Ile Ser Glu Leu Thr Asp Gly Tyr Asp Thr Leu Ile Gly 
545 550 555 560 

GAA CGG GGT GGT CTG CTT TCT GGA GGC CAG AAG CAG CGG GTT GCG ATT 
1728 

Glu Arg Gly Gly Leu Leu Ser Gly Gly Gln Lys Gln Arg Val Ala Ile 
565 570 575 

GCC CGC AGC GTC GTT TCT CAA CCG AAG GTC CTT CTG CTG GAT GAA GCA 
1776 

Ala Arg Ser Val Val Ser Gln Pro Lys Val Leu Leu Leu Asp Glu Ala 
580 585 590 

ACC AGT GCT CTT GAT CCG CAT GCA GAG ACG ATT GTT CAG AAG GCT CTG 
1824 

Thr Ser Ala Leu Asp Pro His Ala Glu Thr Ile Val Gln Lys Ala Leu 
595 600 605 

GAC AAA GCA GCT GAG GGG CGC ACG ACG ATT GTC ATT GCT CAC AAA CTT 
1872 

Asp Lys Ala Ala Glu Gly Arg Thr Thr Ile Val Ile Ala His Lys Leu 
610 615 620 

GCT ACG ATC CGC AAG GCG GAC AAT ATC GTT GTC ATG AGC AAG GGT CAC 
1920 

Ala Thr Ile Arg Lys Ala Asp Asn Ile Val Val Met Ser Lys Gly His 
625 630 635 640 

ATT GTC GAG CAA GGC ACA CAC GAG TCA CTG ATA GCC AAG GAC GGC GTC 
1968 

Ile Val Glu Gln Gly Thr His Glu Ser Leu Ile Ala Lys Asp Gly Val 
645 650 655 

TAT GCC GGT CTG GTC AAA ATC CAG AAC CTG GCA GTG AAT GCT TCA GCA 
2016 

Tyr Ala Gly Leu Val Lys Ile Gln Asn Leu Ala Val Asn Ala Ser Ala 
660 665 670 

CAT GAC AAT GTA AAT GAG GAG GGT GAA GGC GAA GAT GTC GCT CTC CTG 
2064 

His Asp Asn Val Asn Glu Glu Gly Glu Gly Glu Asp Val Ala Leu Leu 
675 680 685 

GAG GTC ACC GAA ACA GCA GTA ACC CGC TAC CCA ACC TCC ATC CGC GGT 
2112 

Glu Val Thr Glu Thr Ala Val Thr Arg Tyr Pro Thr Ser Ile Arg Gly 
690 695 700 

CGA ATG AAC TCC ATA AAG GAC CGC GAC GAT TAT GAG AAC CAC AAG CAC 
2160 

Arg Met Asn Ser Ile Lys Asp Arg Asp Asp Tyr Glu Asn His Lys His 
705 710 715 720 

ATG GAT ATG CTG GCC GCC TTA GCT TAT CTC GTC CGC GAA TGT CCA GAA 
2208 

Met Asp Met Leu Ala Ala Leu Ala Tyr Leu Val Arg Glu Cys Pro Glu 
725 730 735 

CTG AAA TGG GCC TAT CTC GTC GTG CTA CTG GGG TGT CTT GGT GGT TGC 
2256 

Leu Lys Trp Ala Tyr Leu Val Val Leu Leu Gly Cys Leu Gly Gly Cys 
740 745 750 

GCC ATG TAC CCC GGC CAA GCT ATC TTG ATG TCT CGC GTT GTC GAG GTC 
2304 

Ala Met Tyr Pro Gly Gln Ala Ile Leu Met Ser Arg Val Val Glu Val 
755 760 765 

TTC ACG CTC TCG GGA GAC GCT ATG CTA GAC AAA GGA GAC TTC TAT GCC 
2352 

Phe Thr Leu Ser Gly Asp Ala Met Leu Asp Lys Gly Asp Phe Tyr Ala 
770 775 780 

AGT ATG CTG ATC GTT CTC GCG GCC GGG TGT CTG ATC TGT TAC TTA GCT 
2400 

Ser Met Leu Ile Val Leu Ala Ala Gly Cys Leu Ile Cys Tyr Leu Ala 
785 790 795 800 

GTC GGA TAT GCA ACC AAC ACT ATA GCC CAG CAT CTT AGT CAT TGG TTT 
2448 

Val Gly Tyr Ala Thr Asn Thr Ile Ala Gln His Leu Ser His Trp Phe 
805 810 815 

CGA CGC CTC ATT CTG CAC GAC ATG CTG CGA CAG GAT ATC CAG TTC TTT 
2496 

Arg Arg Leu Ile Leu His Asp Met Leu Arg Gln Asp Ile Gln Phe Phe 
820 825 830 

GAC CGT GAA GAG AAC ACT ACC GGT GCG CTG GTA AGC CGT ATC GAT TCG 
2544 

Asp Arg Glu Glu Asn Thr Thr Gly Ala Leu Val Ser Arg Ile Asp Ser 
835 840 845 

TAC CCG CAT GCA ATT CTC GAA CTG ATG GGC TAC AAC ATC GCC CTG GTC 
2592 

Tyr Pro His Ala Ile Leu Glu Leu Met Gly Tyr Asn Ile Ala Leu Val 
850 855 860 

GTG ATT GCT GTC CTG CAG GTG GTA ACC TGT GGC ATC CTG GCC ATT GCA 
2640 

Val Ile Ala Val Leu Gln Val Val Thr Cys Gly Ile Leu Ala Ile Ala 
865 870 875 880 

TTC TCC TGG AAA CTA GGG CTG GTC GTT GTC TTT GGC GGT ATT CCA CCC 
2688 

Phe Ser Trp Lys Leu Gly Leu Val Val Val Phe Gly Gly Ile Pro Pro 
885 890 895 

CTT GTC GGT GCT GGG ATG GTA CGA ATC CGC GTC GAC TCC CGC CTC GAT 
2736 

Leu Val Gly Ala Gly Met Val Arg Ile Arg Val Asp Ser Arg Leu Asp 
900 905 910 

CGC CAG ACA TCG AAG AAA TAT GGC ACC AGC TCG TCC ATT GCC TCT GAA 
2784 

Arg Gln Thr Ser Lys Lys Tyr Gly Thr Ser Ser Ser Ile Ala Ser Glu 
915 920 925 

GCT GTA AAC GCT ATC CGG ACC GTT TCG TCC CTT GCA ATC GAA GAG ACG 
2832 

Ala Val Asn Ala Ile Arg Thr Val Ser Ser Leu Ala Ile Glu Glu Thr 
930 935 940 

GTG CTA CGT CGA TAC ACG GAG GAA CTA GAC CAC GCT GTC TCG TCT TCG 
2880 

Val Leu Arg Arg Tyr Thr Glu Glu Leu Asp His Ala Val Ser Ser Ser 
945 950 955 960 

GTG AAA CCC ATG GCT GCC ACG ATG ATT TGT TTC GGG CTG ACG CAG TGC 
2928 

Val Lys Pro Met Ala Ala Thr Met Ile Cys Phe Gly Leu Thr Gln Cys 
965 970 975 

ATT GAG TAC TGG TTT CAG GCG CTG GGA TTC TGG TAT GGG TGT CGT CTT 
2976 

Ile Glu Tyr Trp Phe Gln Ala Leu Gly Phe Trp Tyr Gly Cys Arg Leu 
980 985 990 

GTG TCG CTG GGG GAG ACT AGC ATG TAT AGT TTC TTT GTC GCA TTC CTC 
3024 

Val Ser Leu Gly Glu Thr Ser Met Tyr Ser Phe Phe Val Ala Phe Leu 
995 1000 1005 

AGT GTG TTC TTT GCG GGT CAG GCG TCA GCG CAG CTG TTC CAG TGG TCG 
3072 

Ser Val Phe Phe Ala Gly Gln Ala Ser Ala Gln Leu Phe Gln Trp Ser 
1010 1015 1020 

ACC AGT ATT ACA AAG GGA ATC AAT GCG ACG AAC TAC ATC GCT TGG TTG 
3120 

Thr Ser Ile Thr Lys Gly Ile Asn Ala Thr Asn Tyr Ile Ala Trp Leu 
1025 1030 1035 1040 

CAC CAG CTC CAA CCA ACA GTG CGC GAG ACG CCG GAG AAC CAC GAT AAA 
3168 

His Gln Leu Gln Pro Thr Val Arg Glu Thr Pro Glu Asn His Asp Lys 
1045 1050 1055 

GGC CCT GGA TCT GGG GCG CCG ATT GCT ATG GAC AAT GTG CGC TTC TCG 
3216 

Gly Pro Gly Ser Gly Ala Pro Ile Ala Met Asp Asn Val Arg Phe Ser 
1060 1065 1070 

TAC CCT CTA CGG CCA GAC GCC CCT ATC CTG AAA GGG GTG AAT CTG AAG 
3264 

Tyr Pro Leu Arg Pro Asp Ala Pro Ile Leu Lys Gly Val Asn Leu Lys 
1075 1080 1085 

ATA AAC AAA GGC CAA TTC ATC GCT TTC GTC GGC TCC TCC GGC TGC GGC 
3312 

Ile Asn Lys Gly Gln Phe Ile Ala Phe Val Gly Ser Ser Gly Cys Gly 
1090 1095 1100 

AAA TCC ACC ATG ATT GCC ATG CTC GAG CGC TTC TAC GAT CCA ACA ACA 
3360 

Lys Ser Thr Met Ile Ala Met Leu Glu Arg Phe Tyr Asp Pro Thr Thr 
1105 1110 1115 1120 

GGG AGC ATC ACA ATC GAC GCT TCC ACC CTC ACC GAC ATA AAC CCC ATA 
3408 

Gly Ser Ile Thr Ile Asp Ala Ser Thr Leu Thr Asp Ile Asn Pro Ile 
1125 1130 1135 

TCC TAC CGA AAT ATT GTG GCA CTG GTG CAG CAA GAG CCA ACC CTT TTC 
3456 

Ser Tyr Arg Asn Ile Val Ala Leu Val Gln Gln Glu Pro Thr Leu Phe 
1140 1145 1150 

CAA GGG ACA ATA CGG GAC AAC ATC TCG CTT GGC GAT GCA GTG AAG TCC 
3504 

Gln Gly Thr Ile Arg Asp Asn Ile Ser Leu Gly Asp Ala Val Lys Ser 
1155 1160 1165 

GTG TCT GAT GAG CAG ATT GAG TCG GCC CTC CGC GCA GCT AAT GCC TGG 
3552 

Val Ser Asp Glu Gln Ile Glu Ser Ala Leu Arg Ala Ala Asn Ala Trp 
1170 1175 1180 

GAC TTT GTC TCC TCA TTG CCG CAG GGG ATC TAC ACG CCC GCT GGC TCA 
3600 

Asp Phe Val Ser Ser Leu Pro Gln Gly Ile Tyr Thr Pro Ala Gly Ser 
1185 1190 1195 1200 

GGC GGG TCC CAA CTC TCT GGG GGG CAG CGG CAA CGC ATT GCC ATT GCC 
3648 

Gly Gly Ser Gln Leu Ser Gly Gly Gln Arg Gln Arg Ile Ala Ile Ala 
1205 1210 1215 

CGC GCG CTC ATC CGA GAT CCA AAG ATC TTA CTC CTT GAC GAG GCT ACG 
3696 

Arg Ala Leu Ile Arg Asp Pro Lys Ile Leu Leu Leu Asp Glu Ala Thr 
1220 1225 1230 

AGT GCC CTG GAT ACA GAG AGT GAG AAG ATC GTG CAG AAG GCT CTC GAG 
3744 

Ser Ala Leu Asp Thr Glu Ser Glu Lys Ile Val Gln Lys Ala Leu Glu 
1235 1240 1245 

GGG GCG GCC AGG GAC GGG GAC CGG CTT ACG GTT GCT GTT GCG CAT CGA 
3792 

Gly Ala Ala Arg Asp Gly Asp Arg Leu Thr Val Ala Val Ala His Arg 
1250 1255 1260 

TTA AGC ACG ATT AAG GAT GCT AAT GTT ATC TGT GTA TTC TTT GGA GGA 
3840 

Leu Ser Thr Ile Lys Asp Ala Asn Val Ile Cys Val Phe Phe Gly Gly 
1265 1270 1275 1280 

AAG ATT GCG GAG ATG GGA ACG CAT CAA GAG TTA ATA GTT AGG GGG GGG 
3888 

Lys Ile Ala Glu Met Gly Thr His Gln Glu Leu Ile Val Arg Gly Gly 
1285 1290 1295 

CTG TAT AGA CGG ATG TGT GAG GCG CAG GCC TTG GAC TAA 
3927 

Leu Tyr Arg Arg Met Cys Glu Ala Gln Ala Leu Asp 
1300 1305 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1308 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Arg Arg Leu Gly Pro Ser Val Tyr Arg Arg Ser Asp Val Ser Thr 
1 5 10 15 

Leu Lys Lys Lys Lys Leu Ser Leu Ser Pro Ser Ser Cys Ser Thr Ala 
20 25 30 

Ala Val Pro Asp Ser Val Ser Gly Arg Val Asp His Gln Cys Thr Met 
35 40 45 

His Gly Gly Ala Ser Gly Arg Gly Arg Gly Gly Ser Lys Leu Trp Arg 
50 55 60 

Ile Gln Gly Ala Lys Leu Ile Cys Ser Arg Lys Arg Gly Ser Leu His 
65 70 75 80 

Ser Pro Ala Gly Gln Asn Leu Ser Phe Arg Pro Leu Leu Ser Leu Leu 
85 90 95 

His Ala Pro Leu Glu Gln Glu Leu Arg Phe Lys Thr Ser Ser Ser Ala 
100 105 110 

Ser Ser Ser Pro Ser Ser Pro Ile Ser Pro Thr Glu Ser Gln Arg Arg 
115 120 125 

Gln Thr Phe Val Thr Met Pro Pro Ser Trp Arg Ile Leu Tyr Phe Val 
130 135 140 

Tyr Leu Gly Ile Ala Arg Leu Val Leu Ser Tyr Thr Tyr Asn Thr Leu 
145 150 155 160 

Leu Thr Tyr Ala Ala Tyr Arg Ile Val Arg Asn Ile Arg His Ala Tyr 
165 170 175 

Leu Lys Ala Ala Leu Ser Gln Glu Val Ala Tyr Tyr Asp Phe Gly Ser 
180 185 190 

Gly Gly Ser Ile Ala Ala Gln Ala Thr Ser Asn Gly Lys Leu Ile Gln 
195 200 205 

Ala Gly Ala Ser Asp Lys Ile Gly Leu Leu Phe Gln Gly Leu Ala Ala 
210 215 220 

Phe Val Thr Leu Ser Leu Ser Arg Leu Trp Cys Lys Trp Lys Leu Thr 
225 230 235 240 

Leu Ile Cys Ile Cys Ile Pro Val Ala Thr Ile Gly Thr Thr Gly Val 
245 250 255 

Val Ala Ala Val Glu Ala Gly His Glu Thr Arg Ile Leu Gln Ile His 
260 265 270 

Ala Gln Ala Asn Ser Phe Ala Glu Gly Ile Leu Ala Gly Val Lys Ala 
275 280 285 

Val His Ala Phe Gly Met Arg Asp Ser Leu Val Arg Lys Phe Asp Glu 
290 295 300 

Tyr Leu Val Glu Ala His Lys Val Gly Lys Lys Ile Ser Pro Leu Leu 
305 310 315 320 

Gly Leu Leu Phe Ser Ala Glu Tyr Thr Ile Ile Tyr Leu Gly Tyr Gly 
325 330 335 

Leu Ala Phe Trp Gln Gly Ile His Met Phe Gly Arg Gly Glu Ile Gly 
340 345 350 

Thr Ala Gly Asp Ile Phe Thr Val Leu Leu Ser Val Val Ile Ala Ser 
355 360 365 

Ile Asn Leu Thr Leu Leu Ala Pro Tyr Ser Ile Glu Phe Ser Arg Ala 
370 375 380 

Ala Ser Ala Ala Ala Gln Leu Phe Arg Leu Ile Asp Arg Glu Ser Glu 
385 390 395 400 

Ile Asn Pro Tyr Gly Lys Glu Gly Leu Glu Pro Glu Arg Val Leu Gly 
405 410 415 

Asp Val Glu Leu Glu Asn Val Thr Phe Ser Tyr Pro Thr Arg Pro Gly 
420 425 430 

Ile Thr Val Leu Asp Asn Phe Ser Leu Lys Val Pro Ala Gly Lys Val 
435 440 445 

Thr Ala Leu Val Gly Gln Ser Gly Ser Gly Lys Ser Thr Ile Val Gly 
450 455 460 

Leu Leu Glu Arg Trp Tyr Asn Pro Thr Ser Gly Ala Ile Arg Leu Asp 
465 470 475 480 

Gly Asn Leu Ile Ser Glu Leu Asn Val Gly Trp Leu Arg Arg Asn Val 
485 490 495 

Arg Leu Val Gln Gln Glu Pro Val Leu Phe Gln Gly Ser Val Phe Asp 
500 505 510 

Asn Ile Arg Tyr Gly Leu Val Gly Thr Pro Trp Glu Asn Ala Ser Arg 
515 520 525 

Glu Glu Gln Met Glu Arg Val Gln Glu Ala Ala Lys Leu Ala Tyr Ala 
530 535 540 

His Glu Phe Ile Ser Glu Leu Thr Asp Gly Tyr Asp Thr Leu Ile Gly 
545 550 555 560 

Glu Arg Gly Gly Leu Leu Ser Gly Gly Gln Lys Gln Arg Val Ala Ile 
565 570 575 

Ala Arg Ser Val Val Ser Gln Pro Lys Val Leu Leu Leu Asp Glu Ala 
580 585 590 

Thr Ser Ala Leu Asp Pro His Ala Glu Thr Ile Val Gln Lys Ala Leu 
595 600 605 

Asp Lys Ala Ala Glu Gly Arg Thr Thr Ile Val Ile Ala His Lys Leu 
610 615 620 

Ala Thr Ile Arg Lys Ala Asp Asn Ile Val Val Met Ser Lys Gly His 
625 630 635 640 

Ile Val Glu Gln Gly Thr His Glu Ser Leu Ile Ala Lys Asp Gly Val 
645 650 655 

Tyr Ala Gly Leu Val Lys Ile Gln Asn Leu Ala Val Asn Ala Ser Ala 
660 665 670 

His Asp Asn Val Asn Glu Glu Gly Glu Gly Glu Asp Val Ala Leu Leu 
675 680 685 

Glu Val Thr Glu Thr Ala Val Thr Arg Tyr Pro Thr Ser Ile Arg Gly 
690 695 700 

Arg Met Asn Ser Ile Lys Asp Arg Asp Asp Tyr Glu Asn His Lys His 
705 710 715 720 

Met Asp Met Leu Ala Ala Leu Ala Tyr Leu Val Arg Glu Cys Pro Glu 
725 730 735 

Leu Lys Trp Ala Tyr Leu Val Val Leu Leu Gly Cys Leu Gly Gly Cys 
740 745 750 

Ala Met Tyr Pro Gly Gln Ala Ile Leu Met Ser Arg Val Val Glu Val 
755 760 765 

Phe Thr Leu Ser Gly Asp Ala Met Leu Asp Lys Gly Asp Phe Tyr Ala 
770 775 780 

Ser Met Leu Ile Val Leu Ala Ala Gly Cys Leu Ile Cys Tyr Leu Ala 
785 790 795 800 

Val Gly Tyr Ala Thr Asn Thr Ile Ala Gln His Leu Ser His Trp Phe 
805 810 815 

Arg Arg Leu Ile Leu His Asp Met Leu Arg Gln Asp Ile Gln Phe Phe 
820 825 830 

Asp Arg Glu Glu Asn Thr Thr Gly Ala Leu Val Ser Arg Ile Asp Ser 
835 840 845 

Tyr Pro His Ala Ile Leu Glu Leu Met Gly Tyr Asn Ile Ala Leu Val 
850 855 860 

Val Ile Ala Val Leu Gln Val Val Thr Cys Gly Ile Leu Ala Ile Ala 
865 870 875 880 

Phe Ser Trp Lys Leu Gly Leu Val Val Val Phe Gly Gly Ile Pro Pro 
885 890 895 

Leu Val Gly Ala Gly Met Val Arg Ile Arg Val Asp Ser Arg Leu Asp 
900 905 910 

Arg Gln Thr Ser Lys Lys Tyr Gly Thr Ser Ser Ser Ile Ala Ser Glu 
915 920 925 

Ala Val Asn Ala Ile Arg Thr Val Ser Ser Leu Ala Ile Glu Glu Thr 
930 935 940 

Val Leu Arg Arg Tyr Thr Glu Glu Leu Asp His Ala Val Ser Ser Ser 
945 950 955 960 

Val Lys Pro Met Ala Ala Thr Met Ile Cys Phe Gly Leu Thr Gln Cys 
965 970 975 

Ile Glu Tyr Trp Phe Gln Ala Leu Gly Phe Trp Tyr Gly Cys Arg Leu 
980 985 990 

Val Ser Leu Gly Glu Thr Ser Met Tyr Ser Phe Phe Val Ala Phe Leu 
995 1000 1005 

Ser Val Phe Phe Ala Gly Gln Ala Ser Ala Gln Leu Phe Gln Trp Ser 
1010 1015 1020 

Thr Ser Ile Thr Lys Gly Ile Asn Ala Thr Asn Tyr Ile Ala Trp Leu 
1025 1030 1035 1040 

His Gln Leu Gln Pro Thr Val Arg Glu Thr Pro Glu Asn His Asp Lys 
1045 1050 1055 

Gly Pro Gly Ser Gly Ala Pro Ile Ala Met Asp Asn Val Arg Phe Ser 
1060 1065 1070 

Tyr Pro Leu Arg Pro Asp Ala Pro Ile Leu Lys Gly Val Asn Leu Lys 
1075 1080 1085 

Ile Asn Lys Gly Gln Phe Ile Ala Phe Val Gly Ser Ser Gly Cys Gly 
1090 1095 1100 

Lys Ser Thr Met Ile Ala Met Leu Glu Arg Phe Tyr Asp Pro Thr Thr 
1105 1110 1115 1120 

Gly Ser Ile Thr Ile Asp Ala Ser Thr Leu Thr Asp Ile Asn Pro Ile 
1125 1130 1135 

Ser Tyr Arg Asn Ile Val Ala Leu Val Gln Gln Glu Pro Thr Leu Phe 
1140 1145 1150 

Gln Gly Thr Ile Arg Asp Asn Ile Ser Leu Gly Asp Ala Val Lys Ser 
1155 1160 1165 

Val Ser Asp Glu Gln Ile Glu Ser Ala Leu Arg Ala Ala Asn Ala Trp 
1170 1175 1180 

Asp Phe Val Ser Ser Leu Pro Gln Gly Ile Tyr Thr Pro Ala Gly Ser 
1185 1190 1195 1200 

Gly Gly Ser Gln Leu Ser Gly Gly Gln Arg Gln Arg Ile Ala Ile Ala 
1205 1210 1215 

Arg Ala Leu Ile Arg Asp Pro Lys Ile Leu Leu Leu Asp Glu Ala Thr 
1220 1225 1230 

Ser Ala Leu Asp Thr Glu Ser Glu Lys Ile Val Gln Lys Ala Leu Glu 
1235 1240 1245 

Gly Ala Ala Arg Asp Gly Asp Arg Leu Thr Val Ala Val Ala His Arg 
1250 1255 1260 

Leu Ser Thr Ile Lys Asp Ala Asn Val Ile Cys Val Phe Phe Gly Gly 
1265 1270 1275 1280 

Lys Ile Ala Glu Met Gly Thr His Gln Glu Leu Ile Val Arg Gly Gly 
1285 1290 1295 

Leu Tyr Arg Arg Met Cys Glu Ala Gln Ala Leu Asp 
1300 1305 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: mRNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
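
SEQ ID NO:3 is listed as an mRNA of 3924 bases, three fewer than the 3927-base cDNA of SEQ ID NO:1. A minimal sketch of how the two sequences appear to relate, assuming the mRNA is simply the sense strand written with uracil in place of thymine and without the terminal stop codon; the function name `transcribe` is hypothetical, and the example uses only the final row of SEQ ID NO:1 as given in the listing:

```python
def transcribe(sense_dna: str) -> str:
    """Return the RNA spelling of a sense-strand DNA sequence (T -> U)."""
    return sense_dna.replace(" ", "").upper().replace("T", "U")


# Final row of SEQ ID NO:1 (positions 3889..3927), including the TAA stop.
cdna_tail = "CTG TAT AGA CGG ATG TGT GAG GCG CAG GCC TTG GAC TAA"

# Assumption: SEQ ID NO:3 omits the stop codon, so drop the last three bases.
coding_tail = cdna_tail.replace(" ", "")[:-3]
mrna_tail = transcribe(coding_tail)

print(mrna_tail)       # last 36 bases expected at the 3' end of SEQ ID NO:3
print(len(mrna_tail))
```

Under that assumption the result should coincide with the last 36 bases of the sequence that follows, which offers a simple cross-check on either listing.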



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

AUGCGGAGGC UCGGACCCUC AGUUUACCGG CGUUCGGACG UGUCUACUUU AAAAAAAAAG 
60 

AAGCUCUCGU UGUCACCAUC GUCAUGCUCG ACCGCGGCUG UACCAGACUC CGUCUCAGGA 
120 

CGAGUCGACC ACCAGUGUAC CAUGCACGGA GGCGCCUCUG GUCGAGGAAG GGGAGGAAGC 
180 

AAGCUUUGGC GCAUACAAGG UGCCAAGCUG AUAUGCUCGC GCAAAAGAGG AUCUUUACAU 
240 

UCGCCGGCAG GACAGAACUU AUCCUUCAGG CCGUUGCUAU CCUUGCUGCA UGCGCCUCUG 
300 

GAGCAGGAAU UGCGCUUCAA AACCUCAUCU UCGGCCAGUU CGUCACCGUC AUCACCGAUU 
360 

UCACCAACGG AAUCUCAACG CCGGCAGACU UUCGUGACAA UGCCGCCGAG UUGGCGUAUC 
420 

CUCUACUUUG UAUACCUGGG CAUCGCGCGG CUCGUCCUCU CCUACACCUA CAACACCCUC 
480 

CUAACCUACG CGGCCUACCG CAUCGUCCGC AAUAUCCGAC ACGCCUAUCU CAAAGCGGCG 
540 

CUGAGCCAAG AAGUGGCAUA CUACGAUUUC GGUAGCGGGG GCUCCAUCGC CGCGCAGGCA 
600 

ACUUCGAACG GCAAACUGAU CCAGGCCGGC GCCUCGGAUA AGAUCGGUCU UCUCUUCCAG 
660 

GGCCUCGCAG CAUUCGUGAC GCUUUCAUUA UCGCGUUUGU GGUGCAAGUG GAAACUCACU 
720 

CUGAUCUGCA UCUGCAUCCC CGUAGCCACG AUCGGCACGA CGGGGGUGGU AGCUGCGGUC 
780 

GAGGCUGGGC ACGAGACGAG GAUCUUGCAG AUACAUGCGC AGGCGAAUUC GUUUGCCGAG 
840 

GGUAUUCUGG CGGGUGUGAA GGCUGUUCAU GCUUUUGGGA UGCGGGAUAG UCUGGUCAGG 
900 

AAGUUUGAUG AAUAUCUGGU GGAGGCGCAU AAGGUCGGUA AGAAGAUCUC GCCGCUGCUU 
960 

GGUCUUCUCU UCUCGGCGGA GUAUACGAUC AUCUACCUUG GAUAUGGGCU GGCGUUUUGG 
1020 

CAGGGGAUCC AUAUGUUCGG CAGGGGGGAG AUUGGGACUG CUGGGGAUAU CUUUACGGUU 
1080 

UUGCUCUCUG UCGUCAUUGC GUCAAUCAAC CUGACUUUAC UGGCGCCGUA UUCAAUUGAA 
1140 

UUUAGCAGGG CUGCUUCAGC GGCUGCGCAA CUGUUCCGAC UCAUAGAUCG AGAGUCUGAA 
1200 

AUCAACCCAU ACGGGAAGGA AGGCCUCGAG CCGGAACGGG UAUUAGGCGA CGUCGAGCUC 
1260 

GAGAAUGUUA CGUUCUCGUA UCCCACGAGG CCGGGGAUUA CCGUCCUCGA UAACUUCAGU 
1320 

CUCAAGGUCC CAGCGGGAAA GGUGACUGCC CUGGUAGGGC AAUCUGGAUC GGGGAAGAGC 
1380 

ACGAUCGUGG GAUUGCUCGA GCGGUGGUAU AACCCGACCU CUGGGGCGAU CAGACUCGAC 
1440 

GGGAACCUGA UCAGUGAGCU CAAUGUUGGC UGGCUGCGGA GGAAUGUGCG GCUCGUACAG 
1500 

CAGGAGCCGG UGCUCUUCCA GGGAAGCGUG UUCGAUAACA UCAGGUACGG CCUCGUCGGG 
1560 

ACGCCGUGGG AGAAUGCCUC UCGGGAAGAG CAGAUGGAAC GGGUGCAGGA GGCCGCGAAG 
1620 

UUGGCAUAUG CGCACGAAUU CAUCUCUGAG CUGACCGACG GAUACGAUAC GCUGAUCGGC 
1680 

GAACGGGGUG GUCUGCUUUC UGGAGGCCAG AAGCAGCGGG UUGCGAUUGC CCGCAGCGUC 
1740 

GUUUCUCAAC CGAAGGUCCU UCUGCUGGAU GAAGCAACCA GUGCUCUUGA UCCGCAUGCA 
1800 

GAGACGAUUG UUCAGAAGGC UCUGGACAAA GCAGCUGAGG GGCGCACGAC GAUUGUCAUU 
1860 

GCUCACAAAC UUGCUACGAU CCGCAAGGCG GACAAUAUCG UUGUCAUGAG CAAGGGUCAC 
1920 

AUUGUCGAGC AAGGCACACA CGAGUCACUG AUAGCCAAGG ACGGCGUCUA UGCCGGUCUG 
1980 

GUCAAAAUCC AGAACCUGGC AGUGAAUGCU UCAGCACAUG ACAAUGUAAA UGAGGAGGGU 
2040 

GAAGGCGAAG AUGUCGCUCU CCUGGAGGUC ACCGAAACAG CAGUAACCCG CUACCCAACC 
2100 

UCCAUCCGCG GUCGAAUGAA CUCCAUAAAG GACCGCGACG AUUAUGAGAA CCACAAGCAC 
2160 

AUGGAUAUGC UGGCCGCCUU AGCUUAUCUC GUCCGCGAAU GUCCAGAACU GAAAUGGGCC 
2220 

UAUCUCGUCG UGCUACUGGG GUGUCUUGGU GGUUGCGCCA UGUACCCCGG CCAAGCUAUC 
2280 



UUGAUGUCUC GCGUUGUCGA GGUCUUCACG CUCUCGGGAG ACGCUAUGCU AGACAAAGGA 
2340 

GACUUCUAUG CCAGUAUGCU GAUCGUUCUC GCGGCCGGGU GUCUGAUCUG UUACUUAGCU 
2400 

GUCGGAUAUG CAACCAACAC UAUAGCCCAG CAUCUUAGUC AUUGGUUUCG ACGCCUCAUU 
2460 

CUGCACGACA UGCUGCGACA GGAUAUCCAG UUCUUUGACC GUGAAGAGAA CACUACCGGU 
2520 

GCGCUGGUAA GCCGUAUCGA UUCGUACCCG CAUGCAAUUC UCGAACUGAU GGGCUACAAC 
2580 

AUCGCCCUGG UCGUGAUUGC UGUCCUGCAG GUGGUAACCU GUGGCAUCCU GGCCAUUGCA 
2640 

UUCUCCUGGA AACUAGGGCU GGUCGUUGUC UUUGGCGGUA UUCCACCCCU UGUCGGUGCU 
2700 

GGGAUGGUAC GAAUCCGCGU CGACUCCCGC CUCGAUCGCC AGACAUCGAA GAAAUAUGGC 
2760 

ACCAGCUCGU CCAUUGCCUC UGAAGCUGUA AACGCUAUCC GGACCGUUUC GUCCCUUGCA 
2820 

AUCGAAGAGA CGGUGCUACG UCGAUACACG GAGGAACUAG ACCACGCUGU CUCGUCUUCG 
2880 

GUGAAACCCA UGGCUGCCAC GAUGAUUUGU UUCGGGCUGA CGCAGUGCAU UGAGUACUGG 
2940 

UUUCAGGCGC UGGGAUUCUG GUAUGGGUGU CGUCUUGUGU CGCUGGGGGA GACUAGCAUG 
3000 

UAUAGUUUCU UUGUCGCAUU CCUCAGUGUG UUCUUUGCGG GUCAGGCGUC AGCGCAGCUG 
3060 

UUCCAGUGGU CGACCAGUAU UACAAAGGGA AUCAAUGCGA CGAACUACAU CGCUUGGUUG 
3120 

CACCAGCUCC AACCAACAGU GCGCGAGACG CCGGAGAACC ACGAUAAAGG CCCUGGAUCU 
3180 

GGGGCGCCGA UUGCUAUGGA CAAUGUGCGC UUCUCGUACC CUCUACGGCC AGACGCCCCU 
3240 

AUCCUGAAAG GGGUGAAUCU GAAGAUAAAC AAAGGCCAAU UCAUCGCUUU CGUCGGCUCC 
3300 

UCCGGCUGCG GCAAAUCCAC CAUGAUUGCC AUGCUCGAGC GCUUCUACGA UCCAACAACA 
3360 

GGGAGCAUCA CAAUCGACGC UUCCACCCUC ACCGACAUAA ACCCCAUAUC CUACCGAAAU 
3420 

AUUGUGGCAC UGGUGCAGCA AGAGCCAACC CUUUUCCAAG GGACAAUACG GGACAACAUC 
3480 

UCGCUUGGCG AUGCAGUGAA GUCCGUGUCU GAUGAGCAGA UUGAGUCGGC CCUCCGCGCA 
3540 

GCUAAUGCCU GGGACUUUGU CUCCUCAUUG CCGCAGGGGA UCUACACGCC CGCUGGCUCA 
3600 

GGCGGGUCCC AACUCUCUGG GGGGCAGCGG CAACGCAUUG CCAUUGCCCG CGCGCUCAUC 
3660 

CGAGAUCCAA AGAUCUUACU CCUUGACGAG GCUACGAGUG CCCUGGAUAC AGAGAGUGAG 
3720 

AAGAUCGUGC AGAAGGCUCU CGAGGGGGCG GCCAGGGACG GGGACCGGCU UACGGUUGCU 
3780 

GUUGCGCAUC GAUUAAGCAC GAUUAAGGAU GCUAAUGUUA UCUGUGUAUU CUUUGGAGGA 
3840 

AAGAUUGCGG AGAUGGGAAC GCAUCAAGAG UUAAUAGUUA GGGGGGGGCU GUAUAGACGG 
3900 

AUGUGUGAGG CGCAGGCCUU GGAC 
3924 